Lab 05 - Camera: Cones classification
Robotics II
Poznan University of Technology, Institute of Robotics and Machine Intelligence
Laboratory 5: Cones classification using Camera sensor
In this course, you will train a classification algorithm as the next stage of the object detection pipeline. The classifier will distinguish three colours of cones: orange, blue, and yellow.
Part I - train the cone classifier
First, follow the interactive tutorial. It uses the SqueezeNet model for object classification. Remember to save the exported ONNX model from the final step.
Part II - build the inference pipeline
Now, we use the detector from the previous course and the classifier from Part I to perform inference on real video recorded onboard an autonomous racecar.
Publishing video from file to ROS topic
Let's download a trimmed part of the video mentioned above and copy it into the docker container.
Run the ROS master instance.
roscore
Run the rviz tool.
rosrun rviz rviz
Publish the video to the camera_left topic.
python scripts/publish_video.py ./kit19d.mp4 --width 608 --height 608 --fps 30
Display the video inside the rviz tool. Follow:
Add -> By topic -> /fsds/camera/cam_left -> Image -> Ok
If everything works, close the rviz node.
Cone classification inference
Note: currently, your workspace is the vision_detector.py file from the previous course.
Copy the cone_clf.onnx model into the docker container. Verify the path to the model in config/config.yaml. Add a new class to the vision script:
```python
class ColorConeClf:
    def __init__(self, cone_clf_path):
        self.ort_sess = ort.InferenceSession(cone_clf_path, providers=['CPUExecutionProvider'])
        self.in_names = [inp.name for inp in self.ort_sess.get_inputs()]

    def preprocess(self, rois: np.ndarray) -> np.ndarray:
        #### TODO: fill the preprocess method
        return rois

    def predict(self, rois):
        rois = self.preprocess(rois)
        outputs = self.ort_sess.run(None, {self.in_names[0]: rois})
        #### TODO: perform post-processing to get categorical output
        return labels
```
Write the preprocess method (a sketch follows after the list):
- iterate over all ROIs and resize each of them to (28, 40) resolution,
- convert the list into a NumPy array,
- convert the array type to np.float32,
- rescale the array values to the <0, 1> range.
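A minimal sketch of what preprocess could look like, assuming cv2 is imported in the script and that (28, 40) is passed directly to cv2.resize (which interprets it as width x height; verify against your model's expected input):

```python
def preprocess(self, rois) -> np.ndarray:
    # Resize every ROI to the classifier's input resolution.
    resized = [cv2.resize(roi, (28, 40)) for roi in rois]
    # Stack into a single batch array, cast to float32,
    # and rescale pixel values from <0, 255> to <0, 1>.
    batch = np.array(resized).astype(np.float32)
    return batch / 255.0
```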
Write the post-processing code inside the predict method:
- convert the one-hot representation into categorical values.
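One way to do this, assuming the first model output holds per-class scores of shape (batch, num_classes), is np.argmax over the class axis:

```python
def predict(self, rois):
    rois = self.preprocess(rois)
    outputs = self.ort_sess.run(None, {self.in_names[0]: rois})
    # outputs[0] is assumed to be (batch, num_classes);
    # argmax yields one integer label per ROI.
    labels = np.argmax(outputs[0], axis=1)
    return labels
```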
Create a ColorConeClf instance in the VisionDetector constructor. It should take a cone_clf_path parameter.
In the camera_detection_callback function, between the filtering step and self.postprocess_cones, do the following (see the sketch after the list):
- convert the values of boxes to the np.int32 type,
- loop over boxes and crop ROIs from the im array,
- call the predict method of ColorConeClf.
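A sketch of that fragment, assuming boxes stores [x1, y1, x2, y2] corners and self.cone_clf is the hypothetical attribute name you gave the ColorConeClf instance:

```python
# Between filtering and self.postprocess_cones:
boxes = boxes.astype(np.int32)            # integer pixel coordinates for slicing
rois = []
for x1, y1, x2, y2 in boxes:              # assumed corner layout; adjust if needed
    rois.append(im[y1:y2, x1:x2])         # crop the ROI from the frame
labels = self.cone_clf.predict(rois)
```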
Update the boxes visualization loop to draw rectangles in the proper colour.
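One option is a label-to-BGR lookup table (the class order here is an assumption; match it to the labels used when training cone_clf.onnx):

```python
# Hypothetical class-index-to-BGR mapping; verify the order against your training labels.
COLORS = {0: (0, 165, 255),   # orange
          1: (255, 0, 0),     # blue
          2: (0, 255, 255)}   # yellow

for (x1, y1, x2, y2), label in zip(boxes, labels):
    cv2.rectangle(im, (x1, y1), (x2, y2), COLORS[int(label)], 2)
```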
Launch the vision_detector node.
roslaunch fsds_roboticsII vision_detector.launch
Then run the video publisher again.
Simulator recording
Download the rosbag recording from the simulator and move it into the container.
Launch the vision_detector node again. Replay the simulation. In the rosbag, the name of the video topic does not match our script, so we have to remap it from the right to the left camera:
rosbag play ./cam_acc.bag /fsds/camera/cam_right:=/fsds/camera/cam_left
TASK
Your task is to draw the text “STOP” after the finish line. To simplify, draw the text when all box classes change from yellow/blue to orange (a sketch follows below). Tips:
- use np.all to handle NumPy logical operations,
- use cv2.putText to add text to the image.
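A possible shape of the check, assuming label index 0 stands for orange (as in the mapping sketched earlier):

```python
# Draw "STOP" once every detected cone is classified as orange.
if len(labels) > 0 and np.all(labels == 0):   # 0 = orange (assumed index)
    cv2.putText(im, "STOP", (50, 80), cv2.FONT_HERSHEY_SIMPLEX,
                2.0, (0, 0, 255), 3)
```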
As a result, upload a screenshot at the finish line from the rviz tool to the eKursy platform.